Towards Fusion of Textual and Visual Modalities for Describing Audiovisual Documents

نویسندگان

Manel Fourati

Anis Jedidi

Hanen Ben Hassin

Faïez Gargouri

چکیده

Audiovisual documents provide a wide range of content description through more descriptors from different media types. Indeed, the extraction of these descriptions has received an increasing attention. But, the lack of semantic description always persists. In fact, this lack affects the retrieval process. To address this problem, this paper describes an automatic and semantic description of cinematic audiovisual documents. This description is based not only on the audiovisual flux in this post-production phase but also in the documentation in the pre-production phase by using textual and visual modalities. In this context, to extract content description, we find it is essential to extract texts superposed in the image. This process is mainly based on the neural network classifier. Moreover, an effective OCR (Tesseract) is adapted for texts recognition. Experiments results confirmed the interesting performance through two databases, namely, “ICDAR 2011” and our own created database from the Internet Movie Database Imdb. Towards Fusion of Textual and Visual Modalities for Describing Audiovisual Documents

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Analysis of the Effect of Visual and Textual Information on the Health Information Perception of High School Girl Students in Tehran

Purpose: Information and information sources can be divided into three broad categories according to their nature or type: textual information (book, journal article, conference paper, dissertation, newspaper, etc.), visual information (infographic, photo, Cartoons, films, etc.) and audiovisual information. The purpose of this study is to determine the effect of reading textual information in c...

متن کامل

Application of Topic Segmentation in Audiovisual Information Retrieval

Segmentation into topically coherent segments is one of the crucial points in information retrieval (IR). Suitable segmentation may improve the results of IR system and help users to find relevant passages faster. Segmentation is especially important in audiovisual recordings, in which the navigation is difficult. We present several methods used for topic segmentation, based on textual, audio a...

متن کامل

TEXTUAL AND INTER-TEXTUAL ANALYSES OF IRANIAN EFL UNDERGRADUATES’ TYPES OF ENGLISH READING TOWARDS DEVELOPING A CAREFUL READING FRAMEWORK

This study investigated textual and inter-textual reading of a group of Iranian EFL undergraduates’ careful English reading types. In this research, Khalifa and Weir’s (2009) reading framework was used to propose a more inclusive aspect of a careful reading framework and the reading construct for instructional and assessment goals. The participants of this study were B.A. students of English Tr...

متن کامل

Audiovisual temporal fusion in 6-month-old infants

The aim of this study was to investigate neural dynamics of audiovisual temporal fusion processes in 6-month-old infants using event-related brain potentials (ERPs). In a habituation-test paradigm, infants did not show any behavioral signs of discrimination of an audiovisual asynchrony of 200 ms, indicating perceptual fusion. In a subsequent EEG experiment, audiovisual synchronous stimuli and s...

متن کامل

End-to-End Audiovisual Fusion with LSTMs

Several end-to-end deep learning approaches have been recently presented which simultaneously extract visual features from the input images and perform visual speech classification. However, research on jointly extracting audio and visual features and performing classification is very limited. In this work, we present an end-to-end audiovisual model based on Bidirectional Long Short-Term Memory...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

IJMDEM

دوره 6 شماره

صفحات -

تاریخ انتشار 2015

Towards Fusion of Textual and Visual Modalities for Describing Audiovisual Documents

نویسندگان

چکیده

منابع مشابه

A Comparative Analysis of the Effect of Visual and Textual Information on the Health Information Perception of High School Girl Students in Tehran

Application of Topic Segmentation in Audiovisual Information Retrieval

TEXTUAL AND INTER-TEXTUAL ANALYSES OF IRANIAN EFL UNDERGRADUATES’ TYPES OF ENGLISH READING TOWARDS DEVELOPING A CAREFUL READING FRAMEWORK

Audiovisual temporal fusion in 6-month-old infants

End-to-End Audiovisual Fusion with LSTMs

عنوان ژورنال:

اشتراک گذاری